Generic Compilation Schemes for Simple Programming Constructs
Authors
Abstract
The invariant requires that the states (MState) are equal, that the global and local program pointers refer to the same instruction, and that the abstract and concrete stacks correspond.

Correspondence on the stacks is specified by the recursive predicate eq_stack(mp), which relates an abstract stack and a concrete stack and uses the stack depth as its measure. It requires that both stacks have the same depth and that, whenever the abstract stack is nonempty, the concrete stack is nonempty as well. In that case, with (mc, pc) denoting the top entry of the abstract stack, startadr and finaladr of the top concrete entry are bounded by length(program(compile(mp))), and the code extracted from program(compile(mp)) between startadr and finaladr is compileC(mc). Moreover, if the popped abstract stack is nonempty, then so is the popped concrete stack, retadr of the top concrete entry is tied to the program pointer stored in the next abstract entry and to startadr of the next concrete entry, and eq_stack(mp) holds for the two popped stacks.

The invariant invar(mp) relates two configurations c and lc. It requires that the abstract states (MState) of c and lc are equal, that the program pointers correspond (if the stack of c is nonempty, then the stack of lc is nonempty and the program pointer of lc is determined by startadr of its top entry and the local program pointer of c), and that the stacks correspond in the sense of eq_stack(mp).

The proof obligations are the same as presented in the last subsection. First, one-step correspondence has to be established; then, by rule induction, the correctness of the closures is proved. The final result is that the modified machine is a refinement of the old one:

    % main result
    machine_simulation: LEMMA
      Ip(compile(mp))(ms1, ms2) IMPLIES Ip(mp)(ms1, ms2)

Specific Compilation Processes

In order to illustrate the applicability of the generic compilation theories, two specific compilation processes are presented. In particular, we describe the compilation of a simple imperative language consisting of expressions and statements into code of a stack machine and of a one-address accumulator machine.

We start by defining syntax and semantics of our simple imperative language. For defining syntax and semantics of expressions, the following parameters are used, abstracting from the concrete type of expression values and from the available set of unary and binary operators.

    % parameters used for specifying the source language
    VarId: TYPE, PId: TYPE, Value: TYPE, Unop: TYPE, Binop: TYPE,
    MUnop: [Unop -> [Value -> Value]],
    MBinop: [Binop -> [Value, Value -> Value]]

Value denotes the type of source values, VarId the type of identifiers, and Unop and Binop the available sets of unary and binary operators, with their semantics given by MUnop and MBinop, respectively. The abstract datatype Expr and an evaluation function eval then define syntax and semantics of expressions, where the state SState is defined as a mapping from identifiers to values.

    % semantics of expressions
    eval(e: Expr, s: SState): RECURSIVE Value =
      CASES e OF
        const(val): val,
        varid(name): s(name),
        unopr(op, arg): MUnop(op)(eval(arg, s)),
        binopr(op, left, right): MBinop(op)(eval(left, s), eval(right, s))
      ENDCASES
    MEASURE e BY <<

Since boolean expressions are treated in a similar way as expressions, we do not define them explicitly but instead suppose that an uninterpreted type BExp together with an evaluation function eval_bexp: [BExp, SState -> bool] is given. Syntax and semantics of statements are defined by importing the generic theories for simple statements and control structures.

    % import syntax and semantics of simple statements
    IMPORTING simple_statements[VarId, Expr, Value, eval]

    % import syntax and semantics of control structures
    IMPORTING ctrlstruc[BExp, SState, PId, eval_bexp, SimpleStatement, ss_meaning]

In the following subsection we deal with the compilation of this language into stack machine code; its compilation into code of a one-address machine is described afterwards. For both machines, compilation of expressions is outlined explicitly, while compilation of statements is carried out by instantiating the generic theories.

A Stack Machine Compilation

We consider a stack machine which is parameterized with respect to the type of memory addresses, the type of machine values, and the set of available unary and binary ALU operations and their semantics. It includes instructions for loading a literal onto the stack (LIT), loading the contents of a specific memory cell onto the stack (LOAD), applying unary and binary operators (UNOP, BINOP), and storing the stack's top element into memory (STORE). The memory is a mapping from addresses to values, and the machine state consists of the stack and the memory, combined in a record type MachineState.

    MachineState: TYPE = [# stack: Stack, mem: Mem #]

The effects of each instruction are specified by function onestep.

    litf(v: Value)(s): deterministic =
      singleton(s WITH [stack := push(v, stack(s))])
    loadf(a: Addr)(s): deterministic =
      singleton(s WITH [stack := push(mem(s)(a), stack(s))])

    onestep(i: Instr): PartialFunction[MachineState, MachineState] =
      CASES i OF
        LIT(v): litf(v),
        LOAD(a): loadf(a),
        UNOP(op): uopf(op),
        BINOP(op): bopf(op),
        STORE(a): storef(a)
      ENDCASES

For defining the semantics of a code sequence, the generic interpreter is imported.

    IMPORTING simple_interpreter[Instr, MachineState, onestep]

Compilation

Consider now the compilation of expressions into stack machine code. We suppose given a predicate representable denoting the set of source values which are representable on the target architecture. The compilation function may only compile constants which have representable values. We further suppose that a bijection valmap from target values to representable source values is given. In addition, an injective function idmap mapping identifiers to memory addresses is required.

    compile(e: Expr): RECURSIVE deterministic[Code] =
      CASES e OF
        const(val): IF representable(val)
                    THEN singleton(LIT(inverse(valmap)(val))::Instr)
                    ELSE emptyset ENDIF,
        varid(name): singleton(LOAD(idmap(name))::Instr),
        unopr(op, arg): compile(arg) ++ singleton(UNOP(op)::Instr),
        binopr(op, left, right):
          compile(left) ++ compile(right) ++ singleton(BINOP(op)::Instr)
      ENDCASES
    MEASURE e BY <<

Correctness of compilation is stated using predicate correct. An abstraction function statemap, mapping machine states to program states, is defined using valmap and idmap.

    statemap(ms: MachineState): SState =
      LAMBDA (v: VarId): valmap(mem(ms)(idmap(v)))

    correct(e: Expr, c: Code): bool =
      FORALL (start, final: MachineState):
        interprete(c)(start, final) IMPLIES
          nonempty?(stack(final)) AND
          eval(e, statemap(start)) = valmap(top(stack(final))) AND
          statemap(final) = statemap(start)

To establish correctness of expression compilation, one has to prove

    correctness: THEOREM compile(e)(c) IMPLIES correct(e, c)

The proof is by induction on the structure of e. The base cases for constants and identifiers as well as the induction step for unary operators can be proved easily. To prove the induction step for binary operators, one first has to establish an invariant, interprete_invariant, which states that after interpreting the compiled code the stack of the final state contains one additional element. This ensures that executing the code for the second subexpression does not affect the value of the first one, i.e., the value of the first subexpression is preserved. The proof of the invariant is also by structural induction on e.

    interprete_invariant: LEMMA
      compile(e)(c) AND interprete(c)(start, final) IMPLIES
        EXISTS (v: TarValue): stack(final) = push(v, stack(start))

Consider now compilation of statements. In order to utilize the generic compilation theory for simple statement compilation, specific values must be provided for the abstract parameters. More specifically:

- The output function accesses the top element of the stack. Hence the output is only defined in states in which the stack contains at least one element. Since the access location of values is constant for this machine (top of stack), parameter T in the generic theory is not required and is instantiated with a default type unit consisting of exactly one element, one.

    % access of values
    outputdefd(u: unit)(ms: MachineState): bool = nonempty?(stack(ms))
    output(u: unit)(ms: (outputdefd(u))): TarValue = top(stack(ms))

- Code for storing values into memory at a specific address is given by the single STORE instruction.

    % storing values
    STORE_code(u: unit, a: Addr): Code = STORE(a)::Instr

- To match the signature of the parameter compileExpr, we simply extend the compilation function compile as follows.

    % compilation of expressions
    compileExpr(e): deterministic[[Code, unit]] = (compile(e), one)

Using these definitions, and letting target_memory denote the state record selector mem, the generic theory can be imported.

    % import compilation of simple statements
    IMPORTING compile_assign[VarId, Expr, Value, eval,
                             Instr, MachineState, onestep, Addr, TarValue,
                             unit, outputdefd, RegFile, STORE_code,
                             target_memory, representable, valmap, idmap,
                             compileExpr, statemap]

Importing this theory, four assumptions are generated. Assumption expression_compilation_correct is discharged using theorem correctness above. Assumption interprete_store is proved easily by unfolding definitions. Assumption symtab_and_memory is trivial, and finally the proof of assumption statemap_and_memory requires injectivity of idmap.

Consider now compilation of control structures. In order to use the generic theory for compiling control structures into basic blocks, a read function has to be defined. The read function for the stack machine is simply a pop operation on the current stack. In addition, an output function with range type bool is required.

    % access of truth values
    read(msdef: (outputdefd)): MachineState =
      msdef WITH [stack := pop(stack(msdef))]
    output_bool(msdfd: (outputdefd)): bool

We will not consider compilation of boolean expressions explicitly, since it closely follows the compilation of expressions. We suppose given a compilation function compileBExpr for boolean expressions satisfying a correctness assumption bexp_comp_correct.

    compileBExpr(b: BExp): deterministic[Code]

    bexp_comp_correct: AXIOM
      FORALL (b: BExp, c: Code):
        compileBExpr(b)(c) IMPLIES
          FORALL (start, final: MachineState):
            interprete(c)(start, final) IMPLIES
              nonempty?(stack(final)) AND
              eval_bexp(b, statemap(start)) = output_bool(final) AND
              statemap(final) = statemap(start)

Importing the generic theory for compiling control structures into basic blocks, three assumptions have to be proved. It must be proved that boolean expression compilation and simple statement compilation are correct. Using the axiom above and the generic compilation theorem simple_statement_comp_correct, respectively, these obligations can be discharged easily. The third obligation states that read must have no effect on corresponding source states. It is proved automatically using GRIND.

Finally, the generic theory for linearization is imported. Since this theory does not contain assumptions, no proof obligations are generated. The following theorem, which states correctness of basic block compilation and linearization of control structures, can then be proved easily using the generic theorems correctness and linearization_correct.

    % compilation of source programs is correct
    stmts_compile_correct: THEOREM
      FORALL (p: source_program):
        FORALL (start, final: MachineState):
          compile_defined(p) AND Ip_cp(compile(p))(start, final) IMPLIES
            P(p)(statemap(start), statemap(final))

Compilation into a One-Address Machine

Our simple one-address machine is parameterized in the same way as the stack machine described in the last subsection: Addr denotes the type of memory addresses, MValue the type of values, and Unop, Binop, munop_sem, and mbinop_sem the available sets of unary and binary operators and their semantics. Here the machine state consists of an accumulator, the memory (a mapping from addresses to values), and a flag of type bool. The machine does not contain general registers. There are instructions for loading a literal into the accumulator (LIT), loading the contents of a specific memory cell into the accumulator (LOADA), applying unary and binary operators (UNOP, BINOPA), and storing the content of the accumulator into memory (STOREA).

    MState: TYPE = [# ac: MValue, mem: [Addr -> MValue], flag: bool #]

The effects of each instruction are specified by function one_step; all arithmetic operations are carried out using the accumulator.

    % effects of instructions
    one_step(i: Instr): PartialFunction[MachineState, MachineState] =
      CASES i OF
        SETFLAG(flg): singleton(ms WITH [flag := flg]),
        LIT(v): singleton(ms WITH [ac := v]),
        LOADA(a): singleton(ms WITH [ac := mem(ms)(a)]),
        UNOP(uop): singleton(ms WITH [ac := munop_sem(uop)(ac(ms))]),
        BINOPA(bop, a):
          singleton(ms WITH [ac := mbinop_sem(bop)(ac(ms), mem(ms)(a))]),
        STOREA(a): singleton(ms WITH [mem := mem(ms) WITH [a := ac(ms)]])
      ENDCASES

For defining the semantics of a code sequence, the generic interpreter is imported.

    IMPORTING simple_interpreter[Instr, MachineState, onestep]

Compilation

As for the stack machine compilation, we suppose given a predicate representable denoting the set of representable source values, a bijection valmap from target values to representable source values, and an injective memory mapping idmap from identifiers to target addresses. Since this machine does not have a stack mechanism, temporary locations for storing intermediate values have to be allocated. More specifically, compiling a binary expression bop(e1, e2) consists of first generating code for e2, saving its value into a temporary location, then generating code for e1 and code for the operator bop, which then accesses the values from the accumulator and the temporary location.

The compilation of expressions thus starts with a set of available temporary locations from which required temporaries are taken. If there are not enough temporaries, the compilation function is undefined, i.e., it returns the empty code set. Type tempset specifies the type of such a set. It is required that locations onto which identifiers are mapped by idmap are not used as temporaries. For allocating locations, we suppose given a function ralloc which selects a free location from a nonempty set of temporaries.

    tempset: TYPE = {M: set[Addr] | FORALL (id: Ident): NOT member(idmap(id), M)}

The complete compiling function is given by compile.

    % compilation of expressions
    RT: TYPE = [Code, tempset]

    compile(e: Expr, free: tempset): RECURSIVE deterministic[RT] =
      CASES e OF
        const(val):
          IF NOT representable(val) THEN emptyset[RT]
          ELSE singleton[RT]((LIT(inverse(valmap)(val))::Code, free)) ENDIF,
        varid(name): singleton[RT]((LOADA(idmap(name))::Code, free)),
        unopr(unop, e1):
          LET m = compile(e1, free) IN
          IF empty?[RT](m) THEN emptyset[RT]
          ELSE LET (code, rest) = select(m) IN
               singleton[RT]((code ++ UNOP(unop), free))
          ENDIF,
        binopr(bop, e1, e2):
          LET m2 = compile(e2, free) IN
          IF empty?[RT](m2) THEN emptyset[RT]
          ELSE LET (code_e2, free_e2) = select(m2) IN
            IF empty?[Addr](free_e2) THEN emptyset[RT]
            ELSE LET temp = ralloc(free_e2) IN
                 LET m1 = compile(e1, remove(temp, free_e2)) IN
              IF empty?[RT](m1) THEN emptyset[RT]
              ELSE LET (code_e1, free_e1) = select(m1) IN
                singleton[RT]((code_e2 ++ STOREA(temp)::Code ++
                               code_e1 ++ BINOPA(bop, temp)::Code, free))
              ENDIF
            ENDIF
          ENDIF
      ENDCASES
    MEASURE e BY <<

A notion of correctness for this compilation is given by predicate correct_compExpr. Informally, this predicate states that if the interpretation of the expression code is defined, the value of the expression can be accessed by reading the contents of the accumulator, and the state transition is not visible on the source state, i.e., locations which are associated with identifiers do not change.

    % notion of correctness
    correct_compExpr(e: Expr, code: Code): bool =
      FORALL (start, final: MState):
        interprete(code)(start, final) IMPLIES
          valmap(ac(final)) = eval(e, statemap(start)) AND
          statemap(final) = statemap(start)

Thus, for proving the correctness of expression compilation in this sense, one has to prove

    % correctness of expression compilation
    expr_compilation_correct: THEOREM
      compile(e, free)(result) IMPLIES correct_compExpr(e, proj_1(result))

The proof is by induction on the structure of expressions. The base cases (constants and identifiers) as well as the induction step for unary operators are proved easily. The most interesting case is the induction step for binary operators. As for the stack machine compilation, one first has to prove an invariant in order to accomplish this step. This invariant states that locations which are not contained in the initial set of temporary locations do not change when executing the code, i.e., only temporaries may change. The proof of the invariant is also by induction on e.

Consider now compilation of statements. The specific values provided for the abstract parameters in the generic theory for simple statement compilation are as follows:

- The output function accesses the accumulator. As in the last subsection, the access location of values is constant, and thus parameter T of the generic theory is not required and is instantiated with the unit type.

    % access of target values
    outputdefd(u: unit)(ms: MachineState): bool = true
    output(u: unit)(ms: (outputdefd(u))): MValue = ac(ms)

- Code for storing values into memory at a specific address is given by the single STOREA instruction.

    STORE_code(u: unit, a: Addr): Code = STOREA(a)::Instr

- To match the signature of the parameter compileExpr, we have to change the compilation function for expressions, compile, as follows, where t_set is a fixed set of temporary locations.

    % compilation function of expressions used for instantiation
    compileExpr(e: Expr): deterministic[[Code, unit]] =
      LET result = compile(e, t_set) IN
      IF empty?[RT](result) THEN emptyset
      ELSE LET (code, rest) = choose(result) IN singleton((code, one))
      ENDIF

Using these definitions, and letting target_memory denote the state record selector mem, the generic theory can be imported.

    % import compilation of simple statements
    IMPORTING compile_assign[Ident, Expr, SrcValue, eval,
                             Instr, MState, one_step, Addr, MValue,
                             unit, outputdefd, RegFile, STORE_code,
                             target_memory, representable, valmap, idmap,
                             compileExpr, statemap]

Importing this theory, four assumptions are generated. Assumption expression_compilation_correct is discharged using theorem expr_compilation_correct above. Assumption interprete_store is proved easily by unfolding definitions. Assumption symtab_and_memory is trivial, and finally the proof of assumption statemap_and_memory requires injectivity of idmap.

Next we deal with the compilation of control structures. As in the last subsection, we do not consider boolean expression compilation explicitly and suppose given a compilation function compileBExpr satisfying the assumption bexp_comp_correct. The boolean output function output_bool tests the flag in the current state.

    % correctness assumption for boolean expression compilation
    bexp_comp_correct: AXIOM
      FORALL (b: BExp, c: Code):
        compileBExpr(b)(c) IMPLIES
          FORALL (start, final: MState):
            interprete(c)(start, final) IMPLIES
              eval_bexp(b, statemap(start)) = output_bool(final) AND
              statemap(final) = statemap(start)

Here outputdefd is instantiated with the constant true function, and read is instantiated with the identity on states. The generic theory for compiling control structures into basic blocks is then imported.

    % import compilation into basic blocks
    IMPORTING c_bb[BExp, SState, PId, eval_bexp, SimpleStatement, ss_meaning,
                   Instr, MState, one_step, outputdefd, output_bool, read,
                   statemap, compileBExpr, compile_simpleStmt]

All generated assumptions are proved in the same way as described in the last subsection. Finally, the generic theory for linearization is imported, which makes it possible to prove the main correctness conjecture.

    % compilation of statements is correct
    stmts_compile_correct: THEOREM
      FORALL (p: source_program):
        FORALL (start, final: MState):
          compile_defined(p) AND Ip_cp(compile(p))(start, final) IMPLIES
            P(p)(statemap(start), statemap(final))
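The two expression compilers described above can be tried out in a small executable model. The Python sketch below is not the PVS development; it is a minimal illustration under several assumptions that do not come from the text: values are Python integers, valmap is the identity (so every constant is representable), IDMAP is a fixed injective table, instructions are encoded as tuples, ralloc picks the smallest free address, and the accumulator compiler evaluates the right operand first so that BINOPA can combine the accumulator with the temporary in the order (ac, mem[temp]). All names (Const, Var, compile_stack, run_acc, and so on) are hypothetical.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical Python encoding of the Expr datatype.
@dataclass(frozen=True)
class Const:
    val: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Unopr:
    op: str
    arg: "Expr"

@dataclass(frozen=True)
class Binopr:
    op: str
    left: "Expr"
    right: "Expr"

Expr = Union[Const, Var, Unopr, Binopr]

# Assumed operator semantics (MUnop / MBinop) and identifier mapping (idmap).
MUNOP = {"neg": lambda v: -v}
MBINOP = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b}
IDMAP = {"x": 0, "y": 1}  # injective by construction

def eval_expr(e, s):
    """Source semantics: evaluate e in source state s (identifier -> value)."""
    if isinstance(e, Const):
        return e.val
    if isinstance(e, Var):
        return s[e.name]
    if isinstance(e, Unopr):
        return MUNOP[e.op](eval_expr(e.arg, s))
    return MBINOP[e.op](eval_expr(e.left, s), eval_expr(e.right, s))

def statemap(mem):
    """Abstraction from target memory to source state (valmap = identity)."""
    return {name: mem[addr] for name, addr in IDMAP.items()}

def compile_stack(e):
    """Postfix code for the stack machine."""
    if isinstance(e, Const):
        return [("LIT", e.val)]
    if isinstance(e, Var):
        return [("LOAD", IDMAP[e.name])]
    if isinstance(e, Unopr):
        return compile_stack(e.arg) + [("UNOP", e.op)]
    return compile_stack(e.left) + compile_stack(e.right) + [("BINOP", e.op)]

def run_stack(code, stack, mem):
    """Interpret stack machine code; returns the final stack and memory."""
    stack = list(stack)
    for instr, arg in code:
        if instr == "LIT":
            stack.append(arg)
        elif instr == "LOAD":
            stack.append(mem[arg])
        elif instr == "UNOP":
            stack.append(MUNOP[arg](stack.pop()))
        elif instr == "BINOP":
            right = stack.pop()
            left = stack.pop()
            stack.append(MBINOP[arg](left, right))
    return stack, dict(mem)

def compile_acc(e, free):
    """Code for the accumulator machine, or None if temporaries run out."""
    if isinstance(e, Const):
        return [("LIT", e.val)]
    if isinstance(e, Var):
        return [("LOADA", IDMAP[e.name])]
    if isinstance(e, Unopr):
        code = compile_acc(e.arg, free)
        return None if code is None else code + [("UNOP", e.op)]
    code_right = compile_acc(e.right, free)
    if code_right is None or not free:
        return None
    temp = min(free)  # stand-in for ralloc
    code_left = compile_acc(e.left, free - {temp})
    if code_left is None:
        return None
    return code_right + [("STOREA", temp)] + code_left + [("BINOPA", (e.op, temp))]

def run_acc(code, ac, mem):
    """Interpret accumulator machine code; returns the final ac and memory."""
    mem = dict(mem)
    for instr, arg in code:
        if instr == "LIT":
            ac = arg
        elif instr == "LOADA":
            ac = mem[arg]
        elif instr == "UNOP":
            ac = MUNOP[arg](ac)
        elif instr == "STOREA":
            mem[arg] = ac
        elif instr == "BINOPA":
            op, addr = arg
            ac = MBINOP[op](ac, mem[addr])
    return ac, mem
```

With x = 7 and y = 5 stored at addresses 0 and 1, the expression (x + y) - (y + 3) evaluates to 4 in the source semantics, leaves 4 on top of the stack on the stack machine, and leaves 4 in the accumulator when two temporaries (say addresses 10 and 11) are available. In both cases statemap of the final memory equals statemap of the initial memory, which mirrors the two correctness predicates. With only one temporary, compile_acc returns None, corresponding to the empty code set.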